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ABSTRACT 

We present a set of ultra-large particle-mesh simulations of the Lyman-a forest targeted at understanding the 
imprint of baryon acoustic oscillations (BAO) in the inter-galactic medium. We use 9 dark matter only simula- 
tions which can, for the first time, simultaneously resolve the Jeans scale of the intergalactic gas while covering 
the large volumes required to adequately sample the acoustic feature. Mock absorption spectra are generated 
using the fluctuating Gunn-Peterson approximation which have approximately correct flux probability density 
functions (PDFs) and small-scale power spectra. On larger scales there is clear evidence in the redshift space 
correlation function for an acoustic feature, which matches a linear theory template with constant bias. These 
spectra, which we make publicly available, can be used to test pipelines, plan future experiments and model 
various physical effects. As an illustration we discuss the basic properties of the acoustic signal in the forest, 
the scaling of errors with noise and source number density, modified statistics to treat mean flux evolution 
and mis-estimation, and non-gravitational sources such as fluctuations in the photo-ionizing background and 
temperature fluctuations due to Hell reionization. 

Subject headings: methods: YV-body simulations — cosmology: large-scale structure of universe 
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l. introduction 

Oscillations of the baryon-photon plasma in the early uni- 
verse, also known as Baryon Acoustic Oscillations (BAO), 
imprint a dist inct signature on the clustering of matter 
dPeebles & Yulfl^lSunyaev & Zeldovichl[T970D which pro- 
vides a "standard ruler" by which we can measure the ex- 
pansion history of t he Universe (see lEisenstein & Hull 19981: 
Meiks irTet alj|1999l for a d etailed description of t he physics 
in modern cosmologies and lEisenstein et aTll2007l for a com- 
parison of Fourier and configuration space pictures). These 
oscillations have been traditionall y measured in the C osmic 
Microwave Background (CMB; see lJarosik et al.l2 010, for the 
latest results) but with the advent of new, large-volume galaxy 
redshift surveys BA O have been detected i n galaxy cluster- 
ing at low z as well (lEisenstein et all 12005b ICole et al l 12005 1 : 
Htitsil 120061: [Blake et alJ l2007t iPadmanab han et al.1 12007a: 
Perc ival et alJl2007l 120091) . 



In principle, the BAO technique becomes even more power- 
ful as one moves to higher redshift, where there is much more 
volume available to be surveyed and where the acoustic scale 
is more deeply in the linear regime. This latter fact helps in 
two ways. First, the power spectrum or correlation function 
can be computed quite accurately with only linear perturba- 
tion theory once one specifies the baryon-to-photon ratio and 
matter-radiation ratio. These are both measured ac curately 
from CMB acoustic peaks (see I White & Cohnll2002l for a re- 
view) so we have a template with which to fit the data. Sec- 
ond, the acoustic oscillations are less damped by non-linear 
evolution providing more modes which contain measurable 



signal. 

Tracing the enormous volumes required with galaxies re- 
quires a heavy investment in telescope time. However, in 
principle any tracer of the mass field will do, includi ng the 
neutral hydr ogen in the inter-ga lactic medium (I Whitd 120031) 
or galaxies (Chang et al. 2008). Tracing neutral hydrogen 
in galaxies via its redshifted 21cm emission is a key goal 
for proposed future radio telescopes. However, even with 
current technology it is relatively straightforward to obtain 
a low resolution spectrum of distant quasars and study the 
Lyman-a forest of absorption lines which map the neutral 
hydrogen along the line-of-sight. At z — 2 - 3 the gas mak- 
ing up the inter-galactic medium (IGM) is thought to be in 
photo-ionization equilibrium, which results in a tight density- 
temperature relation for the absorbing material with the neu- 
tral hyd rogen density proportional to a pow er of the baryon 
density dHui & Gnedinlll997HMeiksinll2007l) . Since pressure 
forces are sub-dominant, the neutral hydrogen density closely 
traces the total matter density on large scales. The structure in 
QSO absorption thus traces, in a calculable way, slight fluc- 
tuations in the matter density of the universe back along the 
line-of-sight to the QSO, with most of the Lyman-a forest 
arising from over-densities of a few times the mean density. 

Motivated by the u pcoming Baryo n Oscillation Spectro- 
scopic Survey (BOSS; Schle gel et al.ll2009l a part of SDSS- 
III), which will deliver an unprecedented number of quasar 
spectra probing the Lyman-a forest at z ~ 2 - 3, and ac- 
cess to the world's fastest supercomputer, "Roadrunner", we 
have produced a set of ultra-large particle-mesh simulations of 
cosmological structure formation to further investigate BAO- 
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Lyman-a science. In this paper we describe our simulations, 
which simultaneously resolve the Jeans scale and the acoustic 
scale, and the construction of mock quasar spectra with prop- 
erties close to those observed at z ~ 2 — 3 (Sec.|2]i. We present 
several examples to illustrate the use of these spectra in under- 
standing BAO-Lyman-a science, describing the 2-point cor- 
relation function of the Lyman-a forest flux in Sec. [3] and 
the impact of two non-gravitational signals (fluctuations in 
the ionizing background and Hell reionization) in Sec. [4] We 
conclude in Sec. [5] In the hope that these spectra can be more 
generally useful, we have made them publicly available^. 

2. SIMULATIONS 
2.1. Particle-mesh simulations 

The study of the BAO signal in the Lyman-a forest is dif- 
ficult both analytically and numerically. Much of the signal 
comes from inter-galactic gas at near mean density, impos- 
ing strong requirements on the mass resolution of any sim- 
ulation. The gas contains structure down to the Jeans scale, 
O(lOOkpc), and resolving this structure while simulating the 
extremely large volumes required by BAO demands impres- 
sive dy namic range. As p art of the "Roadrunner Universe" 
project (Habib et al. 2009) we have run a set of ultra-large 
particle-mesh simulations, in order to resolve the Jeans scale 
in a large cosmological volume. 

In particular we have run 9 particle-mesh simulations of a 
flat ACDM cosmology, with U, M = 0.2 5, fi A = 0-75, h = 0.72 
n = 97 and a 8 = 0.8 (i.e. model of iHeitmann et al.ll2008l 
2009). All of the simulations were started at z = 211 from 
an initially regular Cartesian grid of particles, displaced ac- 
cording to the Zel'dovich approximation. Each simulation 
evolved 4,000 3 particles in a 750/r'Mpc box (particle mass 
5 x 10 8 /i _1 M Q ) computing the forces on a 4,000 3 grid. For 
a description of th e particle-mesh code, and code tests, see 
lHabib etail d2009f) . The correlation function of the mass at 
z ~ 2.4 is shown in Figure[T] compared to the theoretical pre- 
diction. 

These simul a tions are broadly similar to those reported 
in ISlosar et all (120091) . However those simulations, which 
were focused on the large-scale BAO signal, did not prop- 
erly resolve the small-scale power. While this should not 
bias the shape of the flux correlation function on large scales, 
it does affect the distribution of pixels and the large-scale 
bias, which in turn affects the signal-to-noise of any mea- 
surement. Further massaging of such low resolution spectra 
is thus required before they can be used for pipeline tests, 
to investigate signal-to-noise scaling, or as input to Monte- 
Carlo simulations. The larger dynamic range in mass and 
force resolution of the Roadrunner simulations allows us to 
maintain high resolution (comparable to p r evious authors, e.g. 
Meik sin & Whitdl200U jMcDonaldll 2003l: iMandelbaum et al] 
2003t iMcDonald et alTl2005l) while simulating a relatively 
large computational volume. Not even our simulations can 
resolve lOO/z^kpc over the full 1.5/r'Gpc simulated in 
ISlosar et"aT] d2009l) . so we must compromise on either vol- 
ume or resolution. We have chosen to maintain hi gh force and 
mass r esolution by running smaller boxes than in Slos ar et alJ 
(2009) and build up total volume by performing several real- 
izations. 

While the total volume can be enlarged by running many 
individual simulations, if each has too small a box there can 
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FIG. 1. — The real-space mass correlation function, £{(>), with bootstrap 
errors. The solid line shows the prediction of linear theory, convolved with a 
Gaussian of 3/r'Mpc (see j|3.2t . 

be systematic effects introduced into the measurement of the 
acoustic scale. At the high redshifts of interest for Lyman- 
a forest work the missing long-wavelength modes in a finite 
box simulation do not systematically alter the evolution of the 
existing modes — the fundamental mode of the box is well in 
the linear regime. However the finite support of the £-modes 
implied by periodicity may alter the correlation function on 
the scales of interest. To test for this we generated correla- 
tion functions by Fourier transforming power spectra sampled 
only on the £-grid allowed by periodic boxes of a range of 
sizes and compared them to the L\, ox —> oo limit. The largest 
effect is an almost scale independent suppression of £(r) in 
the smaller boxes, but this becomes negligible once the box 
is larger than 500/r'Mpc. For our choice of 750/r'Mpc, 
the correlation function at r ~ 10 2 /i _1 Mpc is indistinguishable 
from that in a much larger box. Our choice of box size and 
resolution is thus adequate for studying the BAO signature in 
the Lyman- a forest in some deta il. 

Recently Nor man et al.1 (120091) have reported fully hydro- 
dynamic simulations covering slightly smaller volumes with 
slightly worse force and mass resolution than the simulations 
reported here. The advantage of these simulations is that 
they attempt to treat the baryonic physics more accurately, but 
with a concomitant increase in computational complexity and 
hence decrease in dynamic range. 

2.2. Mock spectra 

From the phase space data for the particles we produced 
'skewers' through the simulation cube of density and line-of- 
sight velocity at z = 2.25, 2.4, 2.5 and 2.75. The skewers are 
distributed at random across the face of the cube, and run par- 
allel to the sides of the box. 

The density and velocity fields were smoothed with a fil- 
ter of width 100/r'kpc to approximate the effects of thermal 
pressure forces on the gas. Such a scheme clearly ignores 
much of the complexity of gas processes on small scales, but 
comparison with hydrodynamic simulations indicates it re- 
tains many of the features of interest for the Ly-a forest (see 
Meiksin 2007, for a review). For an isothermal gas at mean 
density the (comoving) Jeans filtering scale is 
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with c s = (5/3 ksT /m) x l 2 and in = 0.588 However this is 
the appropriate smoothing scale only if the gas temperature 
scales as (1+z). While the observational situation is uncer- 
tain, it appears that the gas temperature scales more slowly 
than this, and t he comoving Jeans s cale increases with time. 
For this reason Gnedin & Huil (11998b argue for a smoothing 
scale set by the Jeans scale at an earlier epoch, specifically ~ 2 
times smaller than the Jeans scale at the time of observation. 
Mod eling the filtering a s a Gaussian and as s uming 2 x 10 4 K 
gas dRicotfi et all 120001: iSchaye et all 120001: McDona ld et al 
l200UlZaldarriaga et alJfeOOlatlTheuns et alj|2002t iLidz et al 
120091) gives a = 0.14/z _1 Mpc at z = 2 and a = 0.12/z _1 Mpc at 
Z = 3. Given the uncertainties, and the approximate treatment 
of this physics, we shall assume a = 100/i _1 kpc in what fol- 
lows. 

For numerical reasons we used a spline kernel dDehnenl 
2001) rather than a Gaussian. The spline kernel approximates 
a Gaussian well in the core, but vanishes identically beyond a 
range, R. A good match is found when R = 3.25 tr, so we adopt 
R = 0.325 /r'Mpc (er ~ lOO/r'kpc) which is larger than our 
mesh scale and mean inter-particle spacing. We have checked 
that increasing the resolution does not appreciably change the 
spectra when the density field is smoothed on these scales, 
indicating that we are numerically converged. 

For each of 22,500 randomly placed lines-of-sight per 
box the fluctuating Gunn-Peterson approximation ( FGPA; 
ICroft et all [l99l IGnedin & Huil [1991 |F~ 



Aq Xa Xl3~Xa H(z) dX/dx dv/dx 



iMeiksinl 120071) was 
used to generate skewers of optical depth with 4,000 pixels 
ea ch. We assumed a temperature at mean den sity of 2 x 10 4 
K dMcDonaldetalJl200l(: iTheuns et all 120021) and equations 
of state running from 7 = 0.5 to 7 = 1 .5. Different choices for 
the slope, even the inverted equation of state (7 < 1), quanti- 
tatively but not qualitatively change our conclusions. The op- 
tical depth included thermal broadening (assumed Gaussian) 
and skewers are generated both with and without peculiar ve- 
locities for the gas. 

The optical depth was scaled so that the mean transmitted 
flux F = (e xp(-r)) approximately m atched that of the data 
compiled in Meiksin & White) d2004l) : 



-h\F = 0.00211 (1+z) 



3.7 



(2) 



valid for 1.2 < z < 4. We impose this mean flux condition 
over the entire volume. As described later, we also generate 
skewers in which F changes across the volume to understand 
the effect of incomplete modeling of the mean flux evolution. 
Another way to set F is via the amplitude of the flux power 
spectrum. Conveniently, these two methods agreed quite well. 
For completeness, we also generate the skewers with dark- 
matter over-density only, so we can compare the flux statistics 
to those of the underlying mass. 

We work throughout with relative fluctuations in the flux, 
6f = F(x)/F - 1, so our fundamental data set is <5f(x) on 
22,500 skewers of 4,000 pixels each per box. Our 9 simula- 
tions have V ~ 3.8 (/r'Gpc) 3 and cover approximately 1,000 
sq.deg., or 10% of the coverage planned for BOSS. On the 
other hand the line-of-sight areal density (~ 200 per sq.deg. at 
z = 2.5) is much larger than anticipated from BOSS. This 
allows us to study the impact of quasar number density on 
Lyman-a forest studies for future missions. Of course any 
observational program will likely analyze the data in small 
shells of redshifQ over which the evolution of e.g. the mean 

2 Since our simulation boxes are dumped at constant time they can also 
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TABLE 1 

Some useful conversion factors for our cosmology. For 
each redshift, z, we list the observed wavelength [x a ] of ly-a 

emitted at that z, the comoving distance to that redshift 
[xa], the extra comoving distance to ly-/3 if ly-a is at that z 
[X/3 _ Xa ] , the Hubble parameter [H(z)] and conversions 

BETWEEN DISTANCE AND WAVELENGTH [d\/dx\ AND VELOCITY 

[dv/dx\- Distances are measured in Mpc, velocities in km/s 

AND WAVELENGTHS IN A. 

flux, is small. A survey such as BOSS should also be able to 
detect the evolution in H(z) across the redshift range 2 < z < 3 
by the shift in the acoustic feature in velocity space. 

2.3. Properties of the simulated spectra 
At the fiducial redshift of our box, z ~ 2.5, the Ly-a fea- 
ture is redshifted to approximately 4, 000 A, and the box has 
a velocity "length" of 73, 000 km/s. Each of our mock spectra 
thus encompasses the full Ly-a to Ly-/3 region for QSOs at 
z ~ 2.5 (Table Q]). For z — 2-3 a comoving /r'Mpc is ap- 
proximately equal to 1A (Table [U, so with 4,000 pixels per 
spectrum, each simulation pixel is 0.2 /i _1 Mpc wide. For com- 
parison each SDSS-III pixel is about 75 km/s (or just under 
1 /i _1 Mpc), so our spectra are comparably well resolved. 

We normalize the optical depths so that the mean trans- 
mitted flux matches observations. An additional constraint 
then comes from comparing the flux variance of the simula- 
tions and observations. For 7 = 0.5- 1.5 we find o\ =0.1, 
which is very compara ble to the measurements reported in 
iMcDonald et all d2000t) . Figure |2] shows the flux PDF in the 
simulations compare d to that measure d from high resolution 
spectra at z ~ 2.4 in lKim et aT] d2007l) . We see that the dis- 
tribution of high and low absorption regions is approximately 
correct, indicating that the spectra could be useful for testing 
analysis pipelines and investigating the impact of systematic 
errors. While the discrepancies are larger than the quoted ob- 
servational error bars, we note that our modeling could be im- 
proved and the flux PDF is notoriously difficult to measure 
observationally (especially in regions of low absorption). We 
have not attempted a more detailed comparison since the level 
of agreement will be sufficient for our purposes. 

The line-of-sight power spectrum o f the flux at z — 
2.4, compared to the measurements o f ICroft et alj d2002l) . 
iKim et al) (120041) and IMcDonald et alj d2006l) . is shown in 
Figure [3] There is some tension between the flux PDF and 
the power spectrum in the preferred value of F: the flux 
PDF is better fit if we slightly raise F, increasing the number 
of lines of sight with little or no absorption and decreasing 
the number with little or no transmission. The flux power 
spectrum is better fit if we lower F, raising the amplitude 
of Pp(k). For F ~ 0.8 the overall the level of agreement is 
good in both statistics, indicating that the relative amounts of 
power on different scales are approximately as observed in the 

be thought of as effectively larger sky area of "thin" redshift slices. However 
the limited box size means different subregions of the boxes are not fully 
independent. 
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FIG. 2. — The flux PDF in simu lations (lines) and as measured in high reso- 
lution spectra by Kim et al. (2007, squares). Overall the simulations do a fair 
job of describing the flux PDF, containing approximately the right distribu- 
tion of low and high absorption regions. 
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FIG. 3. — The (dimensionless) line-of-sight powe r spectrum of the flux 
measure d from the simul ations at z — 2.4 and fr om Croft et al. (2002, tri- 
angles), Kim et al. ( 2004, circles) and McDonald et al. (2006, squares). As 
in Figure|2] line types indicate the assumed equation-of-state: 7 = 0.5 (long- 
dashed), 7 = 1.0 (short-dashed) and 7 = 1.5 (dotted). The arrow indicates 
roughly the range of scales accessible from R = 2, 000 spectra such as will be 
produced by BOSS. 

real Universe. We have chosen to match t o the later, SDSS- 
based measurements of iMcDonald et all (120061) r ather than 
the sli ghtly higher results of ICroft et al.1 (120021) : iKim et alJ 
(2004). Our spectra thus contain approximately the right 
distribution of fluxes and approximately the right amount of 
small-scale structure, which acts as a source of "noise" in the 
measurement of the acoustic scale. 

In what follows we will work with Sf, ignoring real- 
world issues such as continuum fitting or subtraction, damped 
Lyman-a systems and metal lines. We expect the damping 
wings and metal lines to be uncorrected with the signal of in- 
terest, and so not produce a feature at the acoustic scale. Simi- 
larly, we will be looking at cross-correlation statistics between 
different lines-of-sight. Since several million QSO pairs con- 
tribute to the flux 2-point fu nction, ^At^q) , the fluctuations 
will rapidly average to zero (IViel et al. 2002). 

3. TWO POINT STATISTICS 



3.1. The correlation function 

We compute the density-density or flux-flux (cross-) corre- 
lation function as a direct sum over pixel pairs, omitting pixel 
pairs in the same line-of-sight. We will focus on isotropically 
averaged statistics - due to our limited volume we do not have 
a statistically significant detection of the quadrupole [£2] in 
the acoustic peak regioifl We use 3/z _1 Mpc bins, such that 
the effect of binning changes £ by less than 1 % in the region 
of the acoustic peak. The binned correlation function for any 
model can be as easily computed as the unbinned correlation 
function, e.g. 



d 3 k 



P F (k)Wi(k) 



(3) 



(2tt) 3 

where Pp is the flux power spectrum and 

sin(x + ) - sin(x_) - x+ cos(x + ) + x_ cos(x_) 
Wi(k) = 3 — — , , — (4) 

x+ — x_ 

with x± = k(r ± Ar/2). As Ar -)• we have W* — > jo(kr) = 
sin(kr)/(kr). Larger Ar leads to increased damping as k — > 00 
and a very slight increase in the degree of correlation between 
adjacent bins. For Ar — > 0, infinitesimal pixels and white 
noise, the error on a given £f(r) is infinite. The noise con- 
tribution to a binned statistic is however finite. If our r bins 
are larger than our pixels (as in our case) then it is the bin 
width which tames the noise - any results on e.g. S/N are thus 
dependent on the bin size. 

3.2. Template 

In general the flux correlation function (Figure|4]l can be re- 
lated to the non-linear, redshift-space, mass correlation func- 
tion, £,„(r) as 

&(r) = B(r)Ur)+Mr) (5) 

for some smooth functions B(r) and A(r). 

A first approximation to £„, is to take a multiple of the 
real-space mass correlation function predicted by linear the- 
ory. The dominant effect of non-linear clustering is to 
broaden the acoustic peak, with an amplitude tha t can be es 
timat e d from the rms Zel'doyich displacement {Bharadwa 
| 1996t lEisenstein et al.1 l2007t ICrocce & Scoccimarrol |200T 
iMatsubaral 120081) . At z = 2.5 this is about 3/r'Mpc in our 
cosmology, to be compared to the much larger intrinsic width 
of the acoustic feature (set by the diffusion, or Silk, damping 
scale: 12/r'Mpc). Figure Q] shows the mass correlation func- 
tion measured in the simulations (with errors determined from 
bootstrap as described below) compared to this smoothed lin- 
ear theory. The agreement is very good over a broad range of 
scales (recall the errors are highly correlated). 

On the large scales of interest here redshift space distortions 
simply multiply the real-space, mass correlation function by 
1 + (2/3)/+ (1/5) f* - where f = d\n8/d\na ~ p °J ~ 1 is 
the growth factor dKaiserl [T987I: lHamirtonl["l992l) - and ap- 
ply smoothing of the same order as above. Adding these in 
quadrature we see non-linear evolution and redshift space dis- 
tortions will only change the peak width by ~ 5%. A reason- 
able approximation to £ m (r), therefore, is the linear theory, 
real-space, mass correlation function multiplied by a constant 
and convolved with a Gaussian of 4/! _1 Mpc. 

3 We do see evidence at smaller r for a lower quadrupole-to-monopole 
ratio than in the case of the mass. This is expected due to the mapping from 
density to flux. 
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FIG. 4. — The redshift space flux correlation function, £p, with bootstrap 
errors. The solid line shows linear theory, convolved with a Gaussian and 
multiplied by a constant bias of b 1 = 0.2 2 (see Figure|5j. 
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FIG. 6. — The fractional error on £p as a function of scale for different 
choices of sight-line areal density (per sq.deg.). Since the errors can be noisy, 
we smoothed them over a bin of width 10/r'Mpc. Note the fractional error 
is larger when £p is small, near r ~ 90/r'Mpc, and smaller near the peak 
in £p at t — 110/r'Mpc. The solid line shows the expectation for Gaussian 
fluctuations. 
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FIG. 5.— The flux bias, B 2 (r) = as a function of scale. While the 

results ar e noisy, the b i as is c onsistent with being scale-independent as noted 
earlier bv lSTosar et alj i2009h . 

ISlosar et all ((2009) found B(r) was approximately scale- 
independent, suggesting a good template for the flux correla- 
tion function is the linear theory, real-space, mass correlation 
function multiplied by an effective bias - shown as the solid 
line in Figure[4] Our simulations are consistent with this find- 
ing (Figure [5]), but since we have less volume our constraints 
are not as strong. 

3.3. Covariance 

To assign error bars to the points we can use eith er the box- 
to-box scatter or a bootstrap scheme (Efron 1982). We tried 
two different methods for determining the bootstrap errors. 
The first was simply to draw boxes randomly (with replace- 
ment) and compute the average correlation function from the 
boxes. In the second we randomly assigned each skewer to 
one of 10 bunches within each simulation and used boot- 
strap resampling on the full collection of bunches in all of 
the boxes. Within any individual box the bootstrap errors will 
drastically underestimate the covariance on large scales, how- 
ever in the limit that we have a large number of independent 



boxes and the skewer bunches are uncorrelated it will provide 
a good estimate of the error from the finite survey volume 
(which we expect to dominate). We find that the box-to-box 
variance and first bootstrap scheme provided consistent and 
more stable indicators of the large-scale errors than the sec- 
ond bootstrap. This suggests that with observational data one 
wants to bootstrap on disjoint regions of the sky rather than on 
individual quasar sight-lines. It may also be helpful to apply 
a high-pass filter to make the individual reg ions more inde- 
pendent (see e.g. Padmanabh an et alJ [2009. for discussion), 
though we do not need to do this in our simulations. 

Our 9 boxes cover approximately 1 , 000 sq.deg., or roughly 
10% of the area planned for BOSS. In this regime the errors 
scale as area" 1 / 2 , so if all BOSS quasars probed the Lyman- 
a forest at a fixed redshift, and if noise were not an issue, 
it should achieve errors ~ 3x smaller than those plotted in 
Figure [6] at similar number densities. 

Using this scheme we can look at the dependence of the 
covariance on the number of lines-of-sight per square degree 
and on the noise in the individual spectra. Here we vary each 
of these degrees of freedom independently, allowing us to dis- 
entangle the contributions from finite volume, finite sampling 
and noise. Observationally the only way to achieve a high 
number density of sources is to go further down the luminos- 
ity function, thus at fixed exposure time the signal-to-noise 
will generally be lower at higher number density. Furthermore 
the signal-to-noise will usually vary across the spectrum. A 
detailed study of these issues, or an optimization for a given 
system, is beyond the scope of this paper. 

Figure [6] shows the dependence of the error on £p on the 
density of skewers. Since the error on the error is quite large 
from our small number of simulations we have smoothed the 
error with a sliding window of 3 bins in making this figure. 
Figure [6] shows that increasing the areal density of sight-lines 
from 25 to 50 per sq.deg. results in better determination of 
£f in the acoustic peak region. However the gains saturate 
between 50 and 100 quasars per sq.deg. and going to 200 
quasars per sq.deg. results in no improvement as the error is 
dominated by the finite volume surveyed. 
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These trends can be readily understood by considering the 
case of a Gaussian random field which is sampled along a 
large but finite number of 'skewers'. The observed power 
spectrum is related to the underlying 3D power spectrum 
through a window function. For a single k-mode the variance 
is unaffected by this step, but when considering angularly av- 
eraged quantities [e.g. P(\k\) or £(r)l the erro r is increased 
dMeiksin & Whitd Tr999; Ham ilton et ail 120061) . In the limit 
that the skewers are placed randomly one can show that (see 
Appendix) 



Var[/V] = 



(6) 



./=<> 



where n is the number density of sightlines and the vj involve 
the power spectrum and its integral in a k shell, k-Ak/2 < 
|k| < k+Ak/2. The lowest order term, v = 2Pj/Nk, is the 
familiar result with Nt the number of modes in the shell 
dFeldman et al.|[l994l) . The ratio vi/vo controls the number 
density at which we achieve essentially sample variance lim- 
ited constraints, with the relevant figure of merit depending on 
the ratio of the achieved line-of-sight (i.e. 2D) quasar number 
density to a "critical" value (near 0.01 h 2 Mpc~ 2 ) as described 
in the Appendix. For observations at z ~ 2 this critical num- 
ber density is between 50- 100 quasars per square degree (de- 
pending on the redshift range considered), which is where we 
see diminishing returns in Figure [6] 

While quasar counts at such faint limits are quite uncertain, 
the critical number density corresponds to quasars brighter 
than a B-band magnitude of about 22. Since it is unlikely that 
all such quasars in the range 2 < z < 3 could be successfully 
targeted, and since not all of the forest region will be useful for 
cosmology in all quasars, the relevant sample limits should be 
fainter. At these magnitudes going one magnitude fainter ap- 
proximately doubles the number of quasars per square degree 
(in a fixed, depth ~ 500/r'Mpc, redshift slice) and going two 
magnitudes fainter increases the density by a factor ^3.5. 

In the limit of infinite sampling density the error approaches 
the sample variance limit, which for £(r) can be written as 



Cov[&£y] =| J PjikWmWjik) 



(7) 



where W, indicates the window function for bin i (Eq. |4| and 
we have assumed the monopole dominates the higher multi- 
poles. The relevant volume is V = 3.8 (/r'Gpc) 3 and the plot- 
ted error in Figure [6] is the square root of the diagonal entries 
of this matrix. For a large number of skewers the simulation- 
based errors are in reasonable agreement with the Gaussian 
prediction in the region of the acoustic peak, however the re- 
sults are still somewhat noisy. The results become increas- 
ingly noisy for the off-diagonal elements of the covariance 
matrix, so we have not attempted a quantitative comparison. 
The Gaussian prediction for our 3/r'Mpc bins is that £ at 
the peak (r ~ 110/r'Mpc) is correlated with r = 50, 90 and 
100/r'Mpc at the 30, 80 and 90 per cent level. 

Figure [7] shows the effects of adding noise to the spectra. 
We use 100 quasars per sq.deg., since at that areal density the 
volume is well sampled, and add uncorrelated Gaussian noise 
to each pixel such that the S/N is unity per /i _1 Mpc, or very 
close to unity per A. It is clear from Figure|7]that noise at this 
level is a non-trivial contributor to the total error budget at this 
number density and volume. Recall, as discussed above, that 
in the absence of binning or finite pixels the white-noise error 
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FIG. 7. — The redshift space, flux correlation function for the spectra with 
no noise (squares) and including noise with S/N = 1 per ft _1 Mpc (which is 
close to 1 per A; triangles). The errors are estimated by bootstrapping the 9 
boxes. 

on £ at any r is infinite. Since our r-bins are larger than our 
pixels it is this size which tames this infinity, so the precise 
results depend on the chosen binning. 

3.4. Broad band power & compensated statistics 

One of the advantages of measuring the correlation 
function, rather than the power spectrum, is that it is 
straightforward to compute for even complicated geometries 
dSlosar et al.l 120091) . However when estimating the correla- 
tion function one needs to pay attention to the mean flux in 
the survey (equivalent to the mean galaxy density, or "integral 
constraint" in the case of galaxies). This is not an issue for es- 
timates of the power spectrum, which neatly immunizes high 
k power from slowly varying signals such as a_ mean-density 
or mean-flux error. A (small) misestimate of F in the power 
spectrum leads to a (small) misestimate of the amplitude of 
Pp. In the correlation function however an error in the mean 
flux generates both an additive and multiplicative error, 



r est 



Ftrue 

•Pest 



(8) 



and the additive term can swamp the desired correlation func- 
tion at large scales. If the mean flux evolves with redshift (or 
distance) the situation is slightly more complicated. However 
the limit where the mean flux for any pair of points is the 
same (e.g. the separation of the two points is transverse to the 
line of sight or the separation vector is much smaller than the 
characteristic scale over which F varies) is a simple general- 
ization of the above. If £p is %- or z-independent (as it is in 
our boxes) and the measured mean flux is F, we have 



£obsW = ■ 



F 1 



(9) 



where (• • •) indicates an average over distance or redshift. The 
general case is more complex, involving additional correlators 
of the mean flux and Sp. 

To handle this one can eith er fit out a smoothly varying 
piece of £f ( as advocated in e.g. lSeo & Eisensteinl2005Ll2007t 
IWhitdl2005l to handle galaxies) or compute a statistic which 
automatically reduces the sensitivity to slowly varying modes 
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FIG. 8. — The flux correlation function, £p (thin lines), and compensated 
correlation function, A£f (thick lines), for situations where the mean flux 
varies linearly across the box with an end-to-end amplitude of 0, 0.01, 0.02 
and 0.04 about a fixed mean flux at the box mid-plane. 



(g-g. iHuffetal] 120071 iPadmanabhanetal] [2007b; X uetal] 
l2010l) . 

Figure [8] shows the "extra" correlation that is introduced by 
a mean flux which varies linearly (by 0, 1, 2 and 4%) with 
distance across the box and that much of this is taken out by 



Afr(r) = (£ F «r))-frW 



(10) 



where the first ter m indicates the a verage of the correlation 
function within r dHuff et alj 120071) . Note that in A£ F the 
acoustic "peak" becomes an acoustic "dip". We found that 
the change in A£ F with increasing mean flux variation is not 
well fit by a constant multiplicative or constant additive fac- 
tor. Interpreted as a multiplicative change, the ratio is 1 below 
~ 50/i _1 Mpc, grows smoothly to a factor of 1 .4(4) at the peak, 
and then drops again beyond the peak for mean flux evolution 
of 1(4)% across the box. The larger the mean flux evolution, 
the broader the effect in r and for all cases the ratio peaks at 
the acoustic dip (r ~ 1 10/r'Mpc) and thus alters the dip loca- 
tion very little. If unaccounted for such evolution would lead 
to poor fits between the theoretical template and the observa- 
tions, but the acoustic signal survives in the measurement. 

Assuming Eq. (ffjl, to limit the flux evolution to less than 
1 % we would need to analyze the spectra in shells of redshift 
Az < 0.05 and for 4% the shell would have to be Az < 0.2. 
Alternatively we could imagine including a model for the 
mean flux evolution in our theoretical prediction or, since we 
know the mean flux evolution is in the line-of-sight direc- 
tion it may be possible to isolate this piece by appealing to 
a model for the angular dependence of £f(r). Since such a 
model should also handle redshift space distortions we defer 
this to a future publication. 

4. NON-GRAVITATIONAL CONTRIBUTIONS 

One possible cause for concern, in addition to systematic 
errors in the measurements themselves, is non-gravitational 
contributions to the flux correlations. These could arise from 
hydrodynamic forces, radiative transfer effects, reionization 
heating o r other departur es from the simple FGPA assumed 
thus far (iMeiksinl [2007). Many of these effects are ex- 
pected to contribute mostly on small scales, with no power 
preferentially on the acoustic scale, and should not bias a 



BAO measurement (though it may reduce the signal-to-noise 
of the measurement). Here we investigate two such non- 
gravitational contributions: spatial fluctuations in the ion- 
izing background radiation and spatial fluctuations in the 
temperature-density relation of the gas as could arise e.g. from 
(inhomogeneous) Hell reionization. As mentioned previ- 
ously, fluctuations in the quasar continuum should be less of 
a concern when we average the cross-spectra from many dif- 
ferent quasar sightlines than when we look at the auto -spectra 
of individual sightlines (IViel et al.ll20"oHlWhitdl2003l) . 

4.1. Fluctuations in the ionizing radiation 

One possible contributor of large-scale power is fluctua- 
tions in the UV background field or the photo-ionization rate 
(T). Since the attenuation length of the IGM at z ~ 2-3 is 
O(100Mpc), and the background is thought to be dominated 
by rare sources (OSOs), T may have spa t ial structure on large 
scales dZuolll992l iFardal & Shulfl[l993t iGnedin & Ha milton 
1 20021: IMeiksin & White! [2004; Croft! l2004t iMcDonald etafl 
I2OO5I) . Assuming the IGM is in photo-ionization equilibrium 
the optical depth, r oc T" 1 . Slosaret al] (120091) investigated 
the effects of a fluctuating ph oto-ionization rate in a highly 
simplified model. We follow Slos ar et al.l (120091) in assum- 
ing that the ionizing background field is dominated by light 
from quasars, but extend the analysis to consider diagnostics 
of these additional fluctuation sources. 

We place our quasars at random within the volume with 
luminosities drawn from a luminosity function 



$ oc 



(L/L+T a + (L/L*yP 



(11) 



with a = -3 .3 1 and /3 = -1 .09 dCroom et al.l2004D . Each QSO 
is assumed to emit isotropically with constant luminosity L, so 
the contribution to the photo-ionization rate from the ith QSO 
at distance r, can be taken to be 



T/ oc Lj 



e -n/ro 
4irr} 



(12) 



negle cting finite lifetimes or light-cone effects (see ICroftl 
I2004L for further discussion of these issues). Here ro ~ 
O(100/z -1 Mpc) is the 'attenuation length' of the IGM, which 
is a rapidly decreasing fu nction of redshift. 

IMeiksin & White! d2004l) find, for a ACDM cosmology, that 
the (comoving) attenuation length is approximately 



A) : 



■2 x 10 4 (l+z)" 3 ' 2 /i _I Mpc 



(13) 



for 2.75 < z < 5.5, with diffuse gas and Lyman Limit Sys- 
tems contributing approximately equally. This gives a co- 
moving attenuation length of ro « 200/i _1 Mpc at z = 3, with 
larger values at lower z. This result appears to agree with 
more re cent measurements, for example the recent measure- 
ment of Procha ska et al.l (120091) at higher z agree with the re- 
sults above where they overlap. In lSlosar et alJ ([2009) a value 
of ro = 100/r'Mpc was chosen, to emphasize effects at the 
acoustic scale. Here we use ro « 200/r'Mpc, which is closer 
to the value inferred above, neglecting the dependence on red- 
shift. Increasing ro suppresses the photo-ionization fluctua- 
tion auto-correlation function on small scales while increasing 
it on the (l arge) scales of interest (in agreement with analytic 
arguments; Zuo 1992). This makes the impact of a fixed-rms 
T fluctuation larger at the acoustic scale, further decreasing 
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the contrast of the acoustic peak. For simplicity we have made 
the approximation that ro is the same for all quasars. 

To the particular realization of the QSO component we 
add a uniform piece (to mo del the emission from faint AGN, 
galaxies and the IGM itself; Meiksin 2007) such that the rms 
fluctuation of the total is 10% and divide our original t in ev- 
ery pixel by the ionization rate at that location. The overall 
normalization is rescaled to match the mean flux, F . 

Ideally we would have used the positions of the dark mat- 
ter halos for the sources of our photo-ionizing flux. However 
we do not have halo information for the Roadrunner simula- 
tions. To test for the possible effect of quasar clustering on our 
T(i) we looked at another set of simulations, run using a high 
force-resolution TreePM code dWhitd 120021) in 1.25/i _1 Gpc 
cubes, for whi ch we did ha ve stored halo catalogs. Using 
the model of ICrotonl d2009h we assigned quasar luminosi- 
ties to each halo, with the lowest luminosity quasar having 
L/L+ ~ 2%. The resulting luminosity function and cluster- 
ing agree well with observations at z ~ 2- 3. For both these 
'quasars', and a set with randomized positions but the same 
luminosities, we computed T{x) (as above) for 10 4 randomly 
placed skewers and hence £r- Comparison of £r for the halo- 
produced T(i) and the position-randomized T(x) gives us a 
measure of the effect of source clustering. There is significant 
variation, box-to-box, in £r but the mean at r ~ 100/r'Mpc 
is increased by quasar clustering by ~ 30(20)% when ro = 
100(200)/i _I Mpc. These are small increases in £r> but the 
fact that the halos trace the density field induces correlations 
between V and S m which are missed by our random-position 
approximation and which we cannot adequately probe. It 
remains an open question whether matter-correlated source 
clustering is important for V fluctuations. We also note that 
ro is becoming an appreciable fraction of the simulation box 
length in our main simulations, raising questions of numeri- 
cal convergence. It would be nice to confirm and extend this 
analysis with larger simulations including the halo catalogs, 
and we plan to return to this in a future publication. 

4.2. Fluctuations in the IGM temperature 

In addition to the ionizing background fluctuations de- 
scribed above, the Unive rse undergoes Hell reionization by 
bright QSOs around z ~ 3 (Meiksin 2007). This event can heat 
regions of the IGM by as much as 25, 00 K, with large tem - 
perature fluctuations on 50Mpc scales (McO uinn et al.l2009l) . 
The large QSO bubbles responsible for Hell reionization are 
essentially uncorrected with the overdensity at a given loca- 
tion and no value of 5 is p referentially ionized at a given time 
during Hell reionization (McOuin n et al.l 120091) . This leads 
to significant scatter in T at fixed 5 which is approximately 
"random". 

In order to mock up such a situation in as simple a manner 
as possible, we randomly throw 1 , 000 centers within the box 
and assign each a radius of influence equal to the mean inter- 
center separation and a temperature drawn from a log-normal 
of mean 2 x 10 4 K and <ri n r = 1. The temperature at mean 
density at any point in the box is then the weighted sum of 
these values, with a Gaussian weight depending on distance 
from each center. This gives fluctuations in the temperature at 
mean density which are large, but highly coherent over large 
distances and smoothly varying. 

4.3. Effects on A£ F 
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FIG. 9. — The compensated flux correlation function, A£f , vs. scale for 
our fiducial model (solid), a model with r-fluctuations as described in the 
text (dotted) and a model with fluctuating To (dashed). The large "extra" 
power which dominates £p in the latter two models is removed in A£f, and 
the acoustic signal remains visible though reduced in contrast. 

Both the fluctuations in the photo-ionization rate and mod- 
ulation of the mean temperature give rise to large changes in 
the corr elati on function, similar to what was seen for an evolv- 
ing F ( 133.41 ). As there, much (but not all) of the 'extra' signal 
is smooth and can be removed by considering a compensated 
statistic. Figure|9]shows the effect of T and To fluctuations on 

4.4. Diagnostics 

Diagnostics of these non-gravitational contributi ons can be 
found i n the higher momen ts of the flux dZaldarriag a et alj 
1200 lbt iFang & Whitel 120041) and all indications are that the 
forest is dominated by gravitational instability on large scales. 
However, the errors on such measurements are still large 
enough that the issue is not settled. 

The existing measurements of inter-scale correlations were 
done on individual spectra by computing the cross-power be- 
tween a power of the (band-pass) filtered flux field and the 
original flux field. The auto- and cross-power spectra can 
be related to the power spectrum, bispectrum and trispectrum 
of the flux. Gravitational instability induces a positive cor- 
relation between large- and small-scale density fluctuations, 
which leads to a negative correlation coefficient, C(k), be- 
tween the filtered field and the original flux. 

In the absence of non-gravitational contributions C(k) w 
on scales smaller than the cut-off in the band-pass filter and 
quickly tends to negative values on larger scales. The value 
of C(k — > 0) depends primarily on the width of the band-pass 
filter, being more negative the wider the filter. Existing obser- 
vations of a handful of high S/N, high resolution spectra are 
consistent with the structure produced in gravity-only simula- 
tions using the FGPA. 

We find that in the presence of r-fluctuations the cross- 
correlation, C(k), is altered by a small amount at k~ l ~ 
100/i _1 Mpc for band-passes focused on small scales. Such 
an alteration would have been totally unmeasurable in the ex- 
isting analyses, and whether it is observable in the presence 
of unknown continuum structure and other observational non- 
idealities remains in doubt. The effects of large-scale coher- 
ent fluctuations in To are also quite small in this statistic and 
would not have been seen in existing analyses. 
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FIG. 10. — The 3-point flux correlation function in configuration space for 
isosceles triangles with r\i = m = 50/r'Mpc vs. for the fiducial model 
(solid), a model with T-fluctuations as described in the text (dotted) and a 
model with fluctuating Tq (dashed). An overall amplitude has been removed 
by normalizing to f with f-23 = r\ij1. Note that the model with fluctuating Tq 
shows a different scale dependence than the other models. 

The basic idea of measuring the higher order correlations 
and looking for gravitationally induced signatures remains 
valid however. With upcoming BOSS data as a motivation, 
we here present preliminary results on the large-scale 3-point 
cross-correlation function along three different lines-of-sight. 
Such a statistic will be less susceptible to noise and continuum 
fluctuations than the measurement discussed above. 

There are many possible configurations for which we could 
evaluate the flux 3-point function, and we could evaluate it 
in configuration or Fourier space. We choose here to use 
an isosceles triangle in configuration space. Particularly for 
Tq fluctuations from He reionization, we might expect con- 
figurations where two vertices are in one bubble and one is 
in another to show a large non-gravitational signal. Figure 
[TOl shows the 3-point flux correlation function in configura- 
tion space for isosceles triangles with r\i = = 50/r'Mpc 
vs. r23. Our simulations show that there is a large variance in 
the 3-point function from box to box, and smaller equations- 
of-state, 7, lead to a steeper dependence of £ on ^3. The 
model with fluctuating V cannot be distinguished from pure 
gravity models with 1 < 7 < 1 .5 based on the shape depen- 
dence shown here. Similarly increasing the temperature uni- 
formly to 3 x 10 4 K gives a result close to the solid line in 
Figure \W\ However, the model with To fluctuations shows 
a much flatter dependence on r23 than the others, enabling a 
test of the non-gravitational nature of the fluctuations. We 
have chosen to plot r\2 = 50/r'Mpc since on smaller scales 
the 3-point function is very similar for all of the models (as 
expected) while on large scales the results become very noisy 
though the model differences appear to be accentuated. In 
principle, a survey like BOSS would be able to detect the dif- 
ference with high significance. Whether these differences can 
be determined in the presence of other sources of noise and 
modeling uncertainty, how the signals scale with the model 
parameters and what the best higher-order statistic is, all need 
to be investigated and we hope to return to this in future work. 

5. DISCUSSION AND CONCLUSIONS 

Baryon acoustic oscillations have become one of the pre- 
mier methods for determining cosmological distances and 



hence the expansion history of the Universe. The structure in 
the spectrum of distant quasars, which is thought to trace the 
structure of the IGM at near mean density, has been suggested 
as a relatively cheap method for measuring BAO at high red- 
shift. Motivated by the upcoming BOSS experiment, and the 
availability of the world's fastest supercomputer, "Roadrun- 
ner", we have performed a set of ultra-large particle-mesh 
simulations in order to study the BAO signature in the inter- 
galactic medium. 

From the density and velocity fields we have used the "fluc- 
tuating Gunn-Peterson approximation" (FGPA) to produce 
mock Lyman-a spectra. These simulations are the first to 
be able to cover the large volume necessary to constrain the 
acoustic scale while simultaneously resolving the Jeans scale 
of the gas, and the small-scale power and pixel distribution in 
our mock spectra approximately match those seen in observa- 
tions at z — 2.5. We anticipate that these mock spectra will 
be very useful in testing pipelines, calibrating analysis tools 
and planning future projects so we have made them publicly 
availably. 

In the simulations the Lyman-a flux follows the mass fluc- 
tuations on large s cales, with negligible scale-dependent bias 
dSlosar et al. 2009). This is in some sense to be expected. We 
have assumed the FGPA in producing our mock spectra and 
within this approximation the flux is a highly non-trivial but 
deterministic function of the underlying density field. 

Since the small-scale statistics in our simulations approx- 
imately match observations, we are able to make a reason- 
able estimate of the error in the flux auto-correlation function. 
We discuss the scaling of the error with the number of quasar 
sightlines and the noise in each spectrum. In the limit of low 
noise and many sightlines the error on the correlation function 
is well approximated by the Gaussian expression. 

We have presented preliminary investigations of an evolv- 
ing mean flux, fluctuations in the photo-ionization rate and 
Hell reionization which generate "extra" power on the acous- 
tic scale and reduce the contrast of the acoustic peak. Gravita- 
tional instability produces a well-defined pattern of higher or- 
der correlations which is not obeyed by non-gravitational con- 
tributions such as the above, allowing (in principle) a diagnos- 
tic of non-gravitational physics in the forest. As an example, 
we demonstrate that the 3-point cross-correlation function in 
models with Hell reionization has a different scale depen- 
dence than the 3-point function in gravity-only simulations, 
regardless of the equation-of-state assumed in the latter. 
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APPENDIX 
SCALING WITH NUMBER DENSITY 

The distribution of quasars on the sky determines at what places we sample the underlying density field. As the number density 
of quasars increases so does the number of samples with which we can constrain P(k) or £(r), and hence the fidelity of the 
measurement, but eventually we are limited by the available modes within the survey volu me. H ere we describe how that limit is 
approached for a Gaussian field, which is a reasonable approximation on large scales (see 33.3b - 

To begin we consider a large periodic volume of side length L and volume (or area in 2D) V. Our Fourier transform convention 
has 

S(k) = V~ 1 [ dV <5(r)exp[/k-r] , S(x) = V«5(k)exp[-/k-r] (Al) 
Jv 



k 



both dimensionless and P(k) = V ^|<5(k)| 2 ^). To model the finite sampling we multiply the density field, 5(x), by a window 
function, w(x), which is non-zero only where the field is observed, leading to 

P(k) = ^>(k')|w k _ k ,| 2 (A2) 

k' 

We shall approximate w(x) as a sum of (randomly distributed) 2D <5-functions, or w(k) as a sum of plane waves. If we average in 
a shell Si with ki-Ak/2 < |k| < jfc,-+AJfc/2 we obtain 

= jr. 12 % Sk " w k-k' Wk ~ k " ( A3) 

' keSik'k" 



with mean 



jJrEE^I^I 2 (A4) 
' keSi k' 



and covariance 



Cov 



Pi,Pj\ =J^f.2^2^2^ P{k')P(k") w k2 _ k ,w k ,_ kl w kl _ k »w k »_ k2 . (A5) 

' 1 ' kieS/k 2 eSjk'.k" 



where Nj is the number of modes in the shell and we have explicitly assumed P depends only on |k| (e.g. we have neglected 
redshift space distortions). 

If w(x) is a sum of rt\ os ^-functions, whose positions we average over, the product of w(k)'s in Eq. iA5\ becomes a sum of 
Kronecker 6's. There are 4 kinds of terms (depending on the number of points which coincide) which scale as 1, 1 /«i os , 1 /n 2 os 
and 1 / «j 3 os respectively. In 2D in the continuum limit and for narrow bins the result becomes 

Var[£] = 1 Pikd 2 ( 1 + (A6) 



Ni V nP(ki) 

where a 1 is the density variance and n is the line-of-sight number density. The ratio of the terms going as 1 and 1/n determines 
at what number density we approach the sample variance limit. 

In 3D the situation is slightly more complex because the window function has no structure in the line-of-sight direction, which 
gives a ^-function in &i os , and the spherical average involved in i^_gives more structure to the expression. In the continuum limit, 
the approach to the sample variance limit is governed by the ratioj 

vi 7r f d 3 k 



jP(k)W(k) (A7) 
nvo nkiP(ki) J (IttY 

where the v, are defined in Eq. (|6]l and W(k) is unity for |k| < ki and fc,-/|k| for |k| > kj. For ACDM cosmologies the integral 
is O(0. 01) on the scales of interest. The additional variance depends then upon n & sP (kfi/Po where n e ff = kn.2D ~ n/X and we 
have included the Po factor to indicate that the ratio is independent of the normalization of P(k). W e note that this expression is 
qualitatively similar to, but not the same as, the expression given in McDonald & Eisenstein (2007, Eq. 13). 

For small kj the integral above can be approximated by the ID power spectrum (an approximation that becomes worse as 
k — > oo). With that approximation the ratio can be written as n c rit/«, with the critical (2D) number density being 

where <A» 



5 Including the next-order correction involving V2 is important for quantitative accuracy. 
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which depends on the shape of the power spectrum but not its normalization. As expected, cosmologies with more small- 
scale power require a higher number density of skewers to reach the sample variance dominated limit. For our model n CT n ~ 
0.01 /! 2 Mpc~ 2 , in agreement with the trends seen in Figure|6] 
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